# Reinforcement learning reward model
POLAR 7B
Apache-2.0
POLAR-7B is a scalar reward model built on large-scale pretraining. It uses a policy discriminative learning paradigm to distinguish between policies and align with human preferences.
Large Language Model
Transformers, Supports Multiple Languages

internlm
316
15
Japanese Novel Reward Modernbert Ja 310m
MIT
A reward model for assessing the quality of Japanese novels, fine-tuned from modernbert-ja-310m to predict user evaluations of novel texts.
Large Language Model
Transformers, Japanese

Aratako
15
1